A Novel Field Approach to 3D Gene Expression Pattern Characterization 
We present a vector field method for characterizing the spatial organization of 3D patterns of gene
expression, based on gradients and lines of force obtained by numerical integration. The local maxima
at which these lines of force converge are identified as centers of gene expression, providing a natural
and powerful framework to characterize the organization and dynamics of biological structures. We
apply this novel methodology to analyze the expression patterns of light chain myosin II protein
linked to enhanced green fluorescent protein (EGFP) during zebrafish heart formation.

Animal development involves synchronized gene activation
modulated by environmental influences. Far from being
uniform, such gene expression gives rise to structured
spatial and temporal patterns of varying protein
concentration. Recent advances in biochemical and imaging
methods have paved the way to obtaining 3D reconstructions
of spatial gene activation [3], which can be analyzed in
order to better understand the intricate mechanisms
governing tissue, organ and limb formation. Among the
several currently available methodologies allowing
characterization of 3D gene expression, special attention
has been given to EGFP (enhanced green fluorescent
protein). EGFP is used as a marker: its expression is
controlled by the promoter of the gene of interest,
creating a fluorescent fusion protein that maintains the
normal functions and localization of the wild-type
protein. This methodology can be used to demonstrate gene
activity in intact cells and organisms, while taking into
account the fact that the host protein is continuously
synthesized, degraded, and modified within cells. As this
type of gene expression data becomes available, it is
important to identify and develop mathematical
methodologies for measuring and modeling spatial gene
activation. In addition to traditional approaches (e.g.
density or dispersion estimation), it is important to
consider more sophisticated methods capable of addressing
more directly aspects related to the dynamics of the
involved biological processes, such as cell communication
and migration [6], which play an important role during
both embryonic development and pathological processes.

In this article we characterize the spatial organization
of gene expression patterns in order to assess the
geometrical basis of some dynamical processes during
morphogenesis. To this end, we compute a "gene expression
landscape" as a scalar field $\omega = g(x, y, z)$, where
$\omega$ is interpreted as the amount of expression of the
protein at the spatial position $(x, y, z)$. The same
approach can be used to model and predict the
dissemination of cell signalling or other influence
factors emanating from the cell under analysis which,
combined with the possibility of adopting varying values
of the parameters affecting the field (e.g. the dielectric
constant), defines a truly general framework for
expressing field influences. In analogy with the potential
dynamics of dissipative systems, we obtain the spatial
trajectories (lines of force) that follow the gradient of
gene expression, i.e. its direction of steepest increase.
Such trajectories tend to converge to local peaks of
activity, defining gene expression centers. It is proposed
in this article that the distribution of such centers
provides a natural framework for characterizing and
analyzing the spatial interactions between the involved
developmental rudiments. The potential of such a
methodology is illustrated with respect to the analysis of
zebrafish heart formation from 3D gene expression data.

Zebrafish embryos have been widely used to study heart
formation, due to their transparency and their partial
independence from the cardiovascular system. In
vertebrates, the heart is the first organ that forms and
starts operating. Constrictions and bending (folding) are
key elements in the early morphogenetic shaping of the
heart tube. The spatial gene expression data considered in
this work were acquired through the observation of 42-hour
post-fertilization transgenic zebrafish embryos expressing
EGFP specific for heart mesoderm myosin light chain
(mlc2a-EGFP) [9]. The zebrafish embryos were anesthetized
and kept fixed, and live images of the heart were taken at
ambient room temperature. The image recordings were made
using a Nikon Eclipse TE300 inverted microscope with a
20x/0.75 NA objective. The microscope is coupled to a
Bio-Rad Radiance MP 2100 scanning multiphoton confocal
system (Cambridge, MA) with a two-photon Tsunami laser
(Spectra Physics, CA). The GFP was excited with the
two-photon laser at 900 nm. The total dataset is composed
of 110 confocal sections.

All 110 confocal slices were combined so as to obtain the
three-dimensional volume of the heart, from which the gene
expression landscape was computed as described above. It
is interesting to note that this scalar field can be
visualized with direct volume rendering (DVR) algorithms.
In order to minimize the spatial quantization noise
implied by digital image representation, Gaussian
smoothing was applied to the gene concentration data. This
is done through the discrete convolution of a
three-dimensional Gaussian kernel $k(x, y, z)$ with the
scalar field $\omega$, as expressed in Eq. (1):

\[
\omega(x,y,z) * k(x,y,z) = \sum_{i,j,k} \omega(i,j,k)\, k(x-i,\, y-j,\, z-k) \tag{1}
\]
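
As an illustration only, the stacking of the confocal sections and the discrete Gaussian convolution above can be sketched in Python with NumPy. The kernel width `sigma`, the truncation `radius`, the zero padding at the borders and the synthetic `slices` are assumptions made for this sketch; the text does not state the values or boundary handling actually used. The sketch relies on the fact that an isotropic Gaussian kernel is separable, so that one 1D convolution per axis reproduces the triple sum.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Sampled, normalized 1D Gaussian on the integer offsets -radius..radius."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()

def smooth_volume(omega, sigma=1.0, radius=3):
    """Discrete convolution of a 3D scalar field with an isotropic Gaussian.

    The 3D Gaussian is a product of 1D Gaussians, so convolving along each
    axis in turn is equivalent to the full triple sum (zero padding is
    implied at the borders of the volume).
    """
    kernel = gaussian_kernel_1d(sigma, radius)
    smoothed = omega.astype(float)
    for axis in range(3):
        smoothed = np.apply_along_axis(
            lambda line: np.convolve(line, kernel, mode="same"), axis, smoothed
        )
    return smoothed

# Hypothetical stand-in for the confocal sections: a list of 2D intensity
# images (values in 0..255) that are stacked along the z axis.
slices = [np.zeros((9, 9)) for _ in range(9)]
slices[4][4, 4] = 255.0            # a single bright voxel
omega = np.stack(slices, axis=2)   # volume indexed as (x, y, z)
omega_smooth = smooth_volume(omega, sigma=1.0, radius=3)
```

Because the kernel is normalized, the total intensity of the volume is preserved as long as the bright regions lie farther than `radius` voxels from the border.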

The smoothed reconstruction of the 3D gene activity
pattern is shown in Figure 1(a). The gradient of this
scalar field was estimated by using the enhanced finite
differences scheme described in [11], by convolving the
gene expression concentration with three-dimensional
masks. Next, we compute the lines of force by calculating
the trajectories that follow the gradient, starting from
arbitrary spatial positions sampled as points uniformly
distributed over spheres centered on the
three-dimensional volume.
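
A minimal sketch of these two steps is given below. It does not reproduce the enhanced finite differences scheme of [11]; plain central differences (`numpy.gradient`) are swapped in as a simple substitute. The seed points are drawn uniformly over the surface of a sphere, which is one possible reading of the sampling described above. The function names, the sphere radius and the synthetic linear field are hypothetical.

```python
import numpy as np

def gradient_field(omega_smooth):
    """Central-difference estimate of the gradient of a 3D scalar field.

    Returns an array of shape (nx, ny, nz, 3) holding the x, y and z
    components of the gradient at every voxel.
    """
    gx, gy, gz = np.gradient(omega_smooth.astype(float))
    return np.stack([gx, gy, gz], axis=-1)

def sphere_seeds(center, radius, n_points, rng):
    """Points drawn uniformly over the surface of a sphere.

    Normalizing isotropic Gaussian samples yields directions that are
    uniformly distributed on the unit sphere.
    """
    directions = rng.normal(size=(n_points, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return np.asarray(center, dtype=float) + radius * directions

# Synthetic linear field, whose exact gradient is (2, 3, -1) everywhere.
axes = np.arange(8)
X, Y, Z = np.meshgrid(axes, axes, axes, indexing="ij")
omega_demo = 2.0 * X + 3.0 * Y - 1.0 * Z
grad_demo = gradient_field(omega_demo)

rng = np.random.default_rng(0)
seeds = sphere_seeds(center=(3.5, 3.5, 3.5), radius=2.0, n_points=100, rng=rng)
```

Several concentric spheres can be obtained by calling `sphere_seeds` with different radii and concatenating the results.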

The considered lines of force would correspond, for
instance, to the putative path (set of 3D coordinates)
followed by an object at position $\vec{r} = (x, y, z)$
with gradient dissipative dynamics:

\[
\frac{d\vec{r}}{dt} = \nabla\{\omega(x,y,z) * k(x,y,z)\}. \tag{2}
\]

Standard numerical integration was used in order to
estimate such lines of force, which are illustrated in
Figure 1(b). The sampling criteria removed the lines whose
end point had a scalar value smaller than 10 (from a range
of 0 to 255), eliminating those that do not reach the
regions where mlc2a was being expressed. Very short and
very long trajectories were also removed, because they
were influenced by noise. As expected, these lines
converge to local maxima of the scalar gene expression
field, which can be considered as gene expression centers.
In analogy to graph theory, the total number of sampled
lines of force converging to a specific center is referred
to as the center degree. A total of 734 lines and 89
centers were obtained for the considered 3D gene
expression data.
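
The integration of Eq. (2), the filtering of the lines and the computation of the center degrees can be sketched as follows. This is a hedged illustration, not the integrator used in this work: it assumes a forward Euler scheme with unit-length steps (which changes the time parametrization but not the geometric path), nearest-voxel lookup of the gradient, and the grouping of end points into centers by their nearest voxel. Only the end-point threshold of 10 comes from the text; the step size, the length limits and the synthetic single-peak field are hypothetical.

```python
import numpy as np
from collections import Counter

def trace_line(omega, grad, start, step=0.5, tol=1e-9, max_steps=500):
    """Forward Euler integration of dr/dt = grad(omega) from a seed point.

    Integration stops when the gradient vanishes, the path would leave the
    volume, the field value would decrease (overshoot), or max_steps is hit.
    """
    upper = np.array(omega.shape) - 1
    pos = np.asarray(start, dtype=float)
    path = [pos.copy()]
    for _ in range(max_steps):
        voxel = tuple(np.rint(pos).astype(int))
        g = grad[voxel]
        norm = np.linalg.norm(g)
        if norm < tol:
            break                                  # reached a stationary point
        new_pos = pos + step * g / norm
        if np.any(new_pos < 0) or np.any(new_pos > upper):
            break                                  # left the volume
        new_voxel = tuple(np.rint(new_pos).astype(int))
        if omega[new_voxel] < omega[voxel]:
            break                                  # stepped past the peak
        pos = new_pos
        path.append(pos.copy())
    return path

def center_degrees(omega, paths, min_value=10.0, min_steps=3, max_steps=400):
    """Discard implausible lines and count the lines ending at each voxel."""
    degrees = Counter()
    for path in paths:
        end_voxel = tuple(int(v) for v in np.rint(path[-1]))
        n_steps = len(path) - 1
        if omega[end_voxel] < min_value:
            continue        # never reached an expressing region
        if not (min_steps <= n_steps <= max_steps):
            continue        # too short or too long, likely noise
        degrees[end_voxel] += 1
    return degrees

# Synthetic stand-in for the smoothed field: one Gaussian peak (0..255).
axes = np.arange(21)
X, Y, Z = np.meshgrid(axes, axes, axes, indexing="ij")
omega = 255.0 * np.exp(-((X - 10) ** 2 + (Y - 10) ** 2 + (Z - 10) ** 2) / (2 * 3.0 ** 2))
grad = np.stack(np.gradient(omega), axis=-1)

# Seeds on a sphere of radius 6 around the center of the volume.
rng = np.random.default_rng(0)
directions = rng.normal(size=(50, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
seeds = 10.0 + 6.0 * directions

paths = [trace_line(omega, grad, seed) for seed in seeds]
degrees = center_degrees(omega, paths)
```

In this synthetic example every line climbs to the single peak, so that one center collects all sampled lines; on real data the lines split among many local maxima, each with its own degree.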

Figure 1(b) shows the sampled lines of force obtained by
using the above described methodology, drawn in black or
white according to a thresholding criterion: the lines
corresponding to gene expression activity centers with
degrees smaller than 14 have been marked in white. This
threshold value was defined on the basis of the relative
frequency histogram of the distribution of center degrees,
shown in Figure 2. It can be seen from Figure 1(b) that
the gene activity centers exhibiting higher numbers of
converging lines of force (marked in black) tend to
concentrate along the regions subjected to the
constriction and folding implied by the heart formation
dynamics (marked by arrow 1 in Figure 1) as well as the
sinus venosus (marked by arrow 2 in Figure 1). The
following biological interpretation is suggested in order
to account for this result.
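
The relative frequency histogram of center degrees and the black/white separation can be sketched with the Python standard library alone. The threshold of 14 is the value quoted above; the function names and the small `example_degrees` mapping are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter

def relative_degree_histogram(degrees):
    """Fraction of centers having each degree value."""
    counts = Counter(degrees.values())
    total = len(degrees)
    return {degree: count / total for degree, count in sorted(counts.items())}

def split_by_degree(degrees, threshold=14):
    """Separate centers into high degree (black) and low degree (white)."""
    black = {center for center, d in degrees.items() if d >= threshold}
    white = {center for center, d in degrees.items() if d < threshold}
    return black, white

# Hypothetical centers (voxel coordinates) mapped to their degrees.
example_degrees = {(3, 4, 5): 20, (8, 8, 2): 14, (1, 1, 1): 2, (6, 0, 9): 2}
histogram = relative_degree_histogram(example_degrees)
black, white = split_by_degree(example_degrees)
```

Lines of force then inherit the color of the center to which they converge.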

The heart forms from a tube of epimyocardial cells that
express, among other genes, mlc2a. This gene is expressed
uniformly throughout the heart, with the possibility of a
weaker expression in the inflow pole, i.e. the region of
the venous sinus and the atrium (Figure 1). It is
suggested here that the distribution of active cells could
be determined by a gene activity field in such a way that
the higher degree activity centers positively regulate the
activation patterns of surrounding cells. The line of
force pattern indicates that the expression of these cells
coincides with morphogenetic events of heart formation, in
particular the characteristic constrictions and bendings
of the heart tube at the atrio-ventricular and the
ventriculo-bulbar borders (arrows in Figure 1), which are
sites composed of high degree activity centers, involving
cells more actively producing mlc2a. This process might be
affected by the differential distribution of gene
activation centers, as indicated by the respective numbers
of lines of force, which tended to be smoother at these
locations.

While such hypotheses can only be verified through further
experimental investigations, a novel methodology for 3D
gene activity characterization has been shown to provide a
natural and effective means for quantifying the spatial
interactions between the biological structures involved in
gene expression. Unlike differential measurements such as
gradient or divergence magnitudes, the lines of force and
activity centers are integral features, indicating spatial
interactions over substantial distances. It is expected
that the proposed framework will prove to be useful in a
number of other gene expression investigations, paving the
way to a more objective understanding of the dynamics
governing animal development and its pathologies.

FIG. 1: (a) Visualization of the smoothed and reconstructed gene activity pattern of mlc2a during zebrafish heart formation.
The inflow pole is on the upper left. Arrows 1 and 2 indicate the constriction/bending regions. The respective lines of force
are shown in (b), segregated into black and white as described in the text.

FIG. 2: Relative frequency histogram of the distribution of
center degrees.